
fix: Complete fix for plan mode system across all providers #679

Merged
Shironex merged 11 commits into v0.14.0rc from
feature/bug-complete-fix-for-the-plan-mode-system-inside-sbyt
Jan 25, 2026

Conversation

@Shironex
Collaborator

@Shironex Shironex commented Jan 24, 2026

Summary

This PR provides a comprehensive fix for the plan mode system, addressing all issues reported in #671 and related issues (#619, #627, #531, #660).

Issues Fixed

  • Non-Claude Provider Support: Plan mode now works with all AI providers (OpenAI, Gemini, Cursor, etc.), not just Claude SDK models
  • Crash/Restart Recovery: Features properly resume from where they left off after server restart or crash
  • Spec Todo List UI Updates: Task progress now updates in real-time on the feature card
  • Summary Extraction: Correct summary is now displayed from current execution, not stale previous runs
  • Worktree Mode Support: Confirmed fix for plan generation in worktree mode (Bug: Plan not working when "worktree mode" is set #619)

Technical Changes

  1. Type Consolidation: Moved ParsedTask and PlanSpec interfaces to @automaker/types for consistency across server and UI

  2. Fallback Spec Detection: Added detectSpecFallback() to detect generated specs even when models don't output the [SPEC_GENERATED] marker - detects structural elements like task blocks, acceptance criteria, problem statements, etc.

  3. Recovery System: Added resetStuckFeatures() that runs on auto-mode startup to reset:

    • Features stuck in in_progress → ready/backlog
    • Tasks stuck in in_progress → pending
    • Plan generation stuck in generating → pending
  4. Task Status Persistence: Tasks in planSpec.tasks array now track individual completion status, allowing resume from last completed task

  5. New Events:

    • auto_mode_task_status - Emitted when task status changes
    • auto_mode_summary - Emitted when summary is extracted
  6. Summary Extraction: Added extractSummary() supporting multiple formats:

    • <summary> tags
    • ## Summary markdown sections
    • **Goal**: sections (lite mode)
    • **Problem**: sections (spec/full modes)
  7. UI Updates: Removed Claude model restriction from planning mode selectors - all models can now use planning features

Files Changed

| File | Changes |
|---|---|
| `libs/types/src/feature.ts` | Added ParsedTask and PlanSpec interfaces |
| `libs/types/src/index.ts` | Export new types |
| `apps/server/src/services/auto-mode-service.ts` | Core fixes for all issues |
| `apps/server/tests/unit/services/auto-mode-task-parsing.test.ts` | Added 20+ new tests |
| `apps/ui/src/store/app-store.ts` | Import types from @automaker/types |
| `apps/ui/src/hooks/use-auto-mode.ts` | Handle new events |
| `apps/ui/src/hooks/use-query-invalidation.ts` | Invalidate on task updates |
| `apps/ui/src/types/electron.d.ts` | New event type definitions |
| `apps/ui/src/components/views/board-view/dialogs/*.tsx` | Enable planning for all models |

Test plan

  • Unit tests for task parsing and spec detection pass (32 tests)
  • Build passes for packages and server
  • Manual testing: Create feature with spec/full plan mode using non-Claude model
  • Manual testing: Restart server during feature execution, verify resume works
  • Manual testing: Verify task progress updates in real-time on feature card
  • Manual testing: Verify correct summary appears in agent output

Closes #671

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Planning UI enabled for all models; planning mode selector always available.
    • Real-time per-task progress and concise feature/plan summaries surfaced in the UI.
    • Resume/recovery flow to continue persisted tasks and reset stuck features at auto-loop start.
    • Prompts updated to require a final <summary> block for clearer summaries.
  • Bug Fixes

    • Improved crash/recovery handling and more robust summary extraction.
  • Tests

    • E2E test added to verify planning-mode UI and behavior.
  • Refactor

    • Task/plan types centralized and exported for reuse; new event types for task status and summaries.


Closes #671 (Complete fix for the plan mode system inside automaker)
Related: #619, #627, #531, #660

## Issues Fixed

### 1. Non-Claude Provider Support
- Removed Claude model restriction from planning mode UI selectors
- Added `detectSpecFallback()` function to detect specs without `[SPEC_GENERATED]` marker
- All providers (OpenAI, Gemini, Cursor, etc.) can now use spec and full planning modes
- Fallback detection looks for structural elements: tasks block, acceptance criteria,
  problem statement, implementation plan, etc.
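
The fallback detection described above could be sketched roughly as follows. This is an illustration only, not the PR's actual `detectSpecFallback()`: the name `detectSpecFallbackSketch`, the regex patterns, the assumed task line format, and the "at least two signals" threshold are all assumptions made for the example.

```typescript
// Hypothetical sketch: patterns and threshold are assumptions, not the real
// detectSpecFallback() from auto-mode-service.ts.
const SPEC_MARKER = '[SPEC_GENERATED]';

// Structural signals of the kind named in the description above.
const STRUCTURAL_PATTERNS: RegExp[] = [
  /^- \[ \] T\d+/m, // a task line such as "- [ ] T001: ..." (assumed format)
  /acceptance criteria/i,
  /problem statement|\*\*Problem\*\*:/i,
  /implementation plan/i,
];

function detectSpecFallbackSketch(text: string, minSignals = 2): boolean {
  // The explicit marker always wins when the model emits it.
  if (text.includes(SPEC_MARKER)) return true;
  // Otherwise count how many structural signals are present.
  const hits = STRUCTURAL_PATTERNS.filter((p) => p.test(text)).length;
  return hits >= minSignals;
}
```

Requiring more than one signal is a design choice in this sketch: it is meant to reduce false positives when ordinary agent output happens to mention a single phrase such as "acceptance criteria".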

### 2. Crash/Restart Recovery
- Added `resetStuckFeatures()` to clean up transient states on auto-mode start
- Features stuck in `in_progress` are reset to `ready` or `backlog`
- Tasks stuck in `in_progress` are reset to `pending`
- Plan generation stuck in `generating` is reset to `pending`
- `loadPendingFeatures()` now includes recovery cases for interrupted executions
- Persisted task status in `planSpec.tasks` array allows resuming from last completed task
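
As a rough sketch of the reset rules listed above (not the actual `resetStuckFeatures()` implementation), the state transitions can be written as a pure function. The `SketchFeature` and `SketchTask` types are simplified stand-ins for the shared types, and the rule for choosing between `ready` and `backlog` is an assumption; the PR text only says "ready or backlog".

```typescript
// Hypothetical sketch: simplified stand-in types, not the real
// definitions from @automaker/types.
type TaskStatus = 'pending' | 'in_progress' | 'completed' | 'failed';
interface SketchTask { id: string; status: TaskStatus }
interface SketchFeature {
  id: string;
  status: string; // e.g. 'backlog' | 'ready' | 'in_progress'
  planSpec?: { status: string; tasks?: SketchTask[] };
}

function resetStuckFeature(feature: SketchFeature): SketchFeature {
  const planSpec = feature.planSpec && {
    ...feature.planSpec,
    // Plan generation interrupted mid-run goes back to pending.
    status: feature.planSpec.status === 'generating' ? 'pending' : feature.planSpec.status,
    // Interrupted tasks are retried; completed ones are kept so execution can resume.
    tasks: feature.planSpec.tasks?.map((t) =>
      t.status === 'in_progress' ? { ...t, status: 'pending' as const } : t
    ),
  };
  // Assumption: an approved plan sends the feature to 'ready', anything else to 'backlog'.
  const status =
    feature.status === 'in_progress'
      ? (planSpec?.status === 'approved' ? 'ready' : 'backlog')
      : feature.status;
  return { ...feature, status, planSpec };
}
```

Because completed tasks keep their status, a resume pass can skip them and continue with the first `pending` task.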

### 3. Spec Todo List UI Updates
- Added `ParsedTask` and `PlanSpec` types to `@automaker/types` for consistent typing
- New `auto_mode_task_status` event emitted when task status changes
- New `auto_mode_summary` event emitted when summary is extracted
- Query invalidation triggers on task status updates for real-time UI refresh
- Task markers (`[TASK_START]`, `[TASK_COMPLETE]`, `[PHASE_COMPLETE]`) are detected
  and persisted to planSpec.tasks for UI display
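
A minimal sketch of how such markers might be picked out of streamed agent output is shown below. The exact marker syntax is not given here, so the format `[TASK_START] T001` (marker followed by a task ID) is an assumption, as are the names `MarkerEvent` and `detectTaskMarkers`.

```typescript
// Hypothetical sketch: assumes markers look like "[TASK_START] T001".
type MarkerEvent = { kind: 'start' | 'complete'; taskId: string };

function detectTaskMarkers(chunk: string): MarkerEvent[] {
  // A fresh regex per call keeps lastIndex state from leaking between chunks.
  const re = /\[(TASK_START|TASK_COMPLETE)\]\s*(T\d+)/g;
  const events: MarkerEvent[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(chunk)) !== null) {
    events.push({
      kind: m[1] === 'TASK_START' ? 'start' : 'complete',
      taskId: m[2],
    });
  }
  return events;
}
```

Each detected event could then be mapped to a status update (`in_progress` on start, `completed` on complete) before the task list is persisted and the status event is emitted.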

### 4. Summary Extraction
- Added `extractSummary()` function to parse summaries from multiple formats:
  - `<summary>` tags (explicit)
  - `## Summary` sections (markdown)
  - `**Goal**:` sections (lite mode)
  - `**Problem**:` sections (spec/full modes)
  - `**Solution**:` sections (fallback)
- Summary is saved to `feature.summary` field after execution
- Summary is extracted from plan content during spec generation
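
The format priority listed above can be illustrated with an ordered list of patterns that are tried in turn. This is a sketch under stated assumptions, not the PR's `extractSummary()`: the regexes, the first-paragraph rule, and the 500-character cap are all hypothetical.

```typescript
// Hypothetical sketch: regexes and the length cap are assumptions.
const SUMMARY_PATTERNS: RegExp[] = [
  /<summary>([\s\S]*?)<\/summary>/i, // explicit tags
  /(?:^|\n)## Summary[ \t]*\n+([\s\S]*?)(?=\n#{1,6} |$)/, // markdown section
  /\*\*Goal\*\*:\s*([^\n]+)/, // lite mode
  /\*\*Problem\*\*:\s*([^\n]+)/, // spec/full modes
  /\*\*Solution\*\*:\s*([^\n]+)/, // fallback
];

function extractSummarySketch(text: string, maxLen = 500): string | null {
  for (const pattern of SUMMARY_PATTERNS) {
    const body = pattern.exec(text)?.[1]?.trim();
    if (body) {
      // Keep only the first paragraph, then cap the length.
      const first = body.split(/\n\s*\n/)[0].trim();
      return first.length > maxLen ? first.slice(0, maxLen - 3) + '...' : first;
    }
  }
  return null; // no recognizable summary format
}
```

Trying the explicit `<summary>` tag first means a model that follows the prompt exactly is never overridden by a looser heuristic.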

### 5. Worktree Mode Support (#619)
- Recovery logic properly handles branchName filtering
- Features in worktrees maintain correct association during recovery

## Files Changed
- libs/types/src/feature.ts - Added ParsedTask and PlanSpec interfaces
- libs/types/src/index.ts - Export new types
- apps/server/src/services/auto-mode-service.ts - Core fixes for all issues
- apps/server/tests/unit/services/auto-mode-task-parsing.test.ts - New tests
- apps/ui/src/store/app-store.ts - Import types from @automaker/types
- apps/ui/src/hooks/use-auto-mode.ts - Handle new events
- apps/ui/src/hooks/use-query-invalidation.ts - Invalidate on task updates
- apps/ui/src/types/electron.d.ts - New event type definitions
- apps/ui/src/components/views/board-view/dialogs/*.tsx - Enable planning for all models

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Jan 24, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Replaces local task/plan types with shared ParsedTask/PlanSpec; adds marker/spec detection, summary extraction, and recovery/resume logic to auto-mode; emits per-task and summary UI events; removes Claude-only gating from planning UI and adds an E2E test for planning-mode visibility.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Types & exports: libs/types/src/feature.ts, libs/types/src/index.ts | Add/export ParsedTask and PlanSpec; switch Feature.planSpec to use PlanSpec. |
| Auto-mode service (server): apps/server/src/services/auto-mode-service.ts | Import shared types; add marker detectors (detectTaskStartMarker, detectTaskCompleteMarker, detectPhaseCompleteMarker), spec detection (detectSpecFallback, extractSummary), recovery helpers (resetStuckFeatures), per-task recovery/resume, planSpec handling, summary persistence, and emit auto_mode_* events. |
| Server tests: apps/server/tests/unit/services/auto-mode-task-parsing.test.ts | Replace local ParsedTask test type with imported ParsedTask from @automaker/types. |
| Prompt defaults: libs/prompts/src/defaults.ts | Append mandatory final <summary> block to multiple auto-mode prompts and continuation templates. |
| UI: planning dialogs: apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx, .../edit-feature-dialog.tsx, .../mass-edit-dialog.tsx | Remove Claude-specific gating; always render PlanningModeSelect; clear requirePlanApproval when mode becomes skip/lite; simplify related UI branches. |
| UI: event handling & invalidation: apps/ui/src/hooks/use-auto-mode.ts, apps/ui/src/hooks/use-query-invalidation.ts | Handle new events auto_mode_task_status and auto_mode_summary; add task/phase events to per-feature invalidation list. |
| UI: store & electron types: apps/ui/src/store/app-store.ts, apps/ui/src/types/electron.d.ts | Import/export ParsedTask/PlanSpec; add auto_mode_task_status and auto_mode_summary variants to AutoModeEvent. |
| UI tests: apps/ui/tests/features/planning-mode-fix-verification.spec.ts | New Playwright E2E test validating planning-mode selector visibility/options and approval checkbox behavior across models. |

Sequence Diagram(s)

sequenceDiagram
  participant UI as Client UI
  participant Server as AutoModeService
  participant Agent as LLM Agent
  participant DB as Persistence

  UI->>Server: start feature / request plan
  Server->>Agent: send planning prompt (streaming)
  Agent-->>Server: streaming plan content, markers, summaries
  Server->>Server: detect markers / detectSpecFallback / extractSummary
  alt spec explicit or fallback resolved
    Server->>DB: persist planSpec (content, tasks, version)
    Server->>UI: emit auto_mode_summary
  end
  loop per-task execution
    Server->>Agent: send per-task execution prompt
    Agent-->>Server: TASK_START
    Server->>DB: update task status (in_progress)
    Agent-->>Server: TASK_COMPLETE / PHASE_COMPLETE
    Server->>DB: persist task completion, saveFeatureSummary
    Server->>UI: emit auto_mode_task_status / task events
  end
  Server->>UI: emit final completion event

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 I hopped through plans both full and spec,
I sniffed for markers, found each fleck.
When agents stalled and trails went cold,
I stitched the tasks and kept the hold.
Now summaries bloom — a carroted gold. 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Linked Issues check | ❓ Inconclusive | While the PR implements most technical requirements (multi-provider support, recovery mechanisms, UI updates, summary extraction, type consolidation), tester feedback indicates intermittent failures with summary extraction quality, duplicate resume attempts causing server errors, and occasional UI corruption. | Address the reported race conditions in resume logic, improve summary extraction robustness to ensure consistency across all execution formats, and add defensive checks to prevent 'already running' errors. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective of the PR: fixing the plan mode system across all AI providers, which is the primary change addressing issue #671. |
| Out of Scope Changes check | ✅ Passed | All changes align with the stated objectives: type consolidation, recovery logic, marker detection, UI fixes for multi-provider support, event handling for task/summary updates, and prompt modifications for summary extraction. |


@gemini-code-assist
Contributor

Summary of Changes

Hello @Shironex, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a significant overhaul to the plan mode system, addressing several critical issues related to provider compatibility, system resilience, and user experience. The changes ensure that the planning functionality is robust and accessible across a wider range of AI models, provides seamless recovery from interruptions, and offers more dynamic and accurate feedback on feature progress and summaries.

Highlights

  • Expanded Provider Compatibility: The planning mode now supports all AI providers (e.g., OpenAI, Gemini, Cursor), removing the previous restriction to Claude SDK models.
  • Enhanced Crash Recovery: The system can now properly resume feature execution from its last state after server crashes or restarts, preventing loss of progress.
  • Real-time UI Updates: The UI for "Spec Todo List" now updates task progress in real-time on the feature card, providing immediate feedback.
  • Accurate Summary Extraction: The system now extracts and displays the correct summary from the current execution, avoiding stale information from previous runs.
  • Improved Spec Detection: A new fallback mechanism detectSpecFallback() has been implemented to identify generated specifications even when models do not output explicit markers, improving compatibility with diverse AI outputs.
  • Task-Level Persistence: Individual task completion statuses within a plan are now persisted, allowing features to resume from the exact last completed task.


@Shironex Shironex self-assigned this Jan 24, 2026
@Shironex Shironex added the Bug, Testers-Requested, Work-In-Progress, and Do Not Merge labels Jan 24, 2026
@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request provides a comprehensive and well-executed fix for the plan mode system, making it provider-agnostic and significantly more robust. The changes, including crash/restart recovery, real-time UI updates, and improved summary extraction, are thoughtfully implemented across the server, UI, and shared type definitions. The addition of extensive unit and end-to-end tests further enhances the quality of this contribution. My review includes one suggestion to refactor a small piece of duplicated logic in the new summary extraction function to improve maintainability.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/server/src/services/auto-mode-service.ts (1)

4363-4471: Mark tasks in_progress even without [TASK_START] markers.

If a model doesn’t emit [TASK_START], tasks stay pending until the end, which undercuts “real‑time” progress and persistence. Mark in_progress when the task begins and let marker detection override when present.

🛠️ Proposed fix
     logger.info(`Starting task ${task.id}: ${task.description}`);
     this.emitAutoModeEvent('auto_mode_task_started', {
       featureId,
       projectPath,
       branchName,
       taskId: task.id,
       taskDescription: task.description,
       taskIndex,
       tasksTotal: parsedTasks.length,
     });
+    await this.updateTaskStatus(projectPath, featureId, task.id, 'in_progress');

Also applies to: 4488-4492

🤖 Fix all issues with AI agents
In `@apps/server/src/services/auto-mode-service.ts`:
- Around line 3748-3762: The recovery branch collects existingApprovedPlan and
persistedTasks but never feeds them into the multi-agent task executor because
specDetected is set true and the plan-handling block is skipped; extract the
multi-agent task execution logic (the loop that processes ParsedTask items) into
a reusable helper (e.g., runMultiAgentTasks or executeTasksForFeature) and
invoke it from both the spec-detection path and the recovery path: when
planningModeRequiresApproval and feature.planSpec.status === 'approved' and
persistedTasks exists, call the helper with persistedTasks (and
existingApprovedPlan metadata) so task-level resumption actually runs; ensure
the helper signature accepts the same context/params used by the original loop
and remove duplicated logic in the specDetected branch.
- Around line 268-323: The extractSummary function currently returns the first
regex match and can pick up stale/older summaries from appended agent-output;
update extractSummary to prefer the last match for each pattern (for <summary>,
## Summary, **Goal**, **Problem/Problem Statement**, and **Solution**) by using
a global search (e.g., matchAll or iterating regex.exec with the g flag) and
selecting the final capture group result before trimming and truncating
(preserve the existing first-paragraph and length-truncation logic for each
case); ensure you replace each text.match(...) usage in extractSummary with
logic that finds the last match and then applies the same content extraction
steps.
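
The "prefer the last match" suggestion above can be sketched with a small helper. The name `lastCapture` is hypothetical; it only illustrates the iteration technique and is not code from this PR.

```typescript
// Hypothetical helper: returns capture group 1 of the LAST match of `pattern`.
function lastCapture(text: string, pattern: RegExp): string | null {
  // Make sure the global flag is set so exec() advances through the text.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  let last: string | null = null;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m[1] !== undefined) last = m[1].trim();
    // Guard against infinite loops on zero-length matches.
    if (m[0].length === 0) re.lastIndex++;
  }
  return last;
}
```

With appended agent output that contains an older and a newer `<summary>` block, calling `lastCapture(output, /<summary>([\s\S]*?)<\/summary>/)` yields the newer one, which is the behavior the comment asks for.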
🧹 Nitpick comments (7)
apps/server/tests/unit/services/auto-mode-task-parsing.test.ts (1)

339-374: Consider sharing detectSpecFallback logic instead of re-implementing it in tests.

A small helper exported from the service (or a utility module) would prevent test/production drift as detection patterns evolve.

apps/ui/src/types/electron.d.ts (1)

5-5: Avoid duplicating ParsedTask shape in event typing.

Reusing the shared type keeps UI events aligned with server types and reduces maintenance.

♻️ Suggested refactor
-import type { ClaudeUsageResponse, CodexUsageResponse } from '@/store/app-store';
+import type { ClaudeUsageResponse, CodexUsageResponse } from '@/store/app-store';
+import type { ParsedTask } from '@automaker/types';
@@
   | {
       type: 'auto_mode_task_status';
       featureId: string;
       projectPath?: string;
       taskId: string;
-      status: 'pending' | 'in_progress' | 'completed' | 'failed';
-      tasks: Array<{
-        id: string;
-        description: string;
-        filePath?: string;
-        phase?: string;
-        status: 'pending' | 'in_progress' | 'completed' | 'failed';
-      }>;
+      status: ParsedTask['status'];
+      tasks: ParsedTask[];
     }

Also applies to: 337-356

apps/ui/src/components/views/board-view/dialogs/edit-feature-dialog.tsx (2)

1-1: Consider removing @ts-nocheck directive.

This directive suppresses all TypeScript errors in the file, which can hide real type issues. If there are specific type issues preventing compilation, consider addressing them directly or using targeted @ts-expect-error comments with explanations.


472-491: Dead code: else branch is unreachable.

Since modelSupportsPlanningMode is now always true, this entire else branch will never execute. The tooltip text "Planning modes are only available for Claude Provider" is also now incorrect. Consider removing this dead code to improve maintainability.

♻️ Suggested cleanup
-              {modelSupportsPlanningMode ? (
                 <PlanningModeSelect
                   mode={planningMode}
                   onModeChange={setPlanningMode}
                   testIdPrefix="edit-feature-planning"
                   compact
                 />
-              ) : (
-                <TooltipProvider>
-                  <Tooltip>
-                    <TooltipTrigger asChild>
-                      <div>
-                        <PlanningModeSelect
-                          mode="skip"
-                          onModeChange={() => {}}
-                          testIdPrefix="edit-feature-planning"
-                          compact
-                          disabled
-                        />
-                      </div>
-                    </TooltipTrigger>
-                    <TooltipContent>
-                      <p>Planning modes are only available for Claude Provider</p>
-                    </TooltipContent>
-                  </Tooltip>
-                </TooltipProvider>
-              )}
apps/ui/src/components/views/board-view/dialogs/mass-edit-dialog.tsx (1)

305-337: Dead code: else branch is unreachable.

Since modelSupportsPlanningMode is now always true, this entire branch (lines 305-337) containing the disabled planning mode with Claude-only tooltip will never execute. Consider removing this dead code.

♻️ Suggested cleanup
         {/* Planning Mode */}
-        {modelSupportsPlanningMode ? (
           <FieldWrapper
             label="Planning Mode"
             isMixed={mixedValues.planningMode || mixedValues.requirePlanApproval}
             willApply={applyState.planningMode || applyState.requirePlanApproval}
             onApplyChange={(apply) =>
               setApplyState((prev) => ({
                 ...prev,
                 planningMode: apply,
                 requirePlanApproval: apply,
               }))
             }
           >
             <PlanningModeSelect
               mode={planningMode}
               onModeChange={(newMode) => {
                 setPlanningMode(newMode);
                 // Auto-suggest approval based on mode, but user can override
                 setRequirePlanApproval(newMode === 'spec' || newMode === 'full');
               }}
               requireApproval={requirePlanApproval}
               onRequireApprovalChange={setRequirePlanApproval}
               testIdPrefix="mass-edit-planning"
             />
           </FieldWrapper>
-        ) : (
-          <TooltipProvider>
-            <Tooltip>
-              <TooltipTrigger asChild>
-                <div
-                  className={cn(
-                    'p-3 rounded-lg border transition-colors border-border bg-muted/20 opacity-50 cursor-not-allowed'
-                  )}
-                >
-                  <div className="flex items-center justify-between mb-3">
-                    <div className="flex items-center gap-2">
-                      <Checkbox checked={false} disabled className="opacity-50" />
-                      <Label className="text-sm font-medium text-muted-foreground">
-                        Planning Mode
-                      </Label>
-                    </div>
-                  </div>
-                  <div className="opacity-50 pointer-events-none">
-                    <PlanningModeSelect
-                      mode="skip"
-                      onModeChange={() => {}}
-                      testIdPrefix="mass-edit-planning"
-                      disabled
-                    />
-                  </div>
-                </div>
-              </TooltipTrigger>
-              <TooltipContent>
-                <p>Planning modes are only available for Claude Provider</p>
-              </TooltipContent>
-            </Tooltip>
-          </TooltipProvider>
-        )}
apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx (2)

1-1: Consider removing @ts-nocheck directive.

Same concern as in edit-feature-dialog.tsx - this suppresses all TypeScript errors in the file. Consider addressing specific type issues instead.


580-599: Dead code: else branch is unreachable.

Since modelSupportsPlanningMode is now always true, this branch with the disabled planning selector and "Planning modes are only available for Claude Provider" tooltip will never execute. Consider removing this dead code for consistency with the recommended cleanup in the other dialog files.

♻️ Suggested cleanup
-              {modelSupportsPlanningMode ? (
                 <PlanningModeSelect
                   mode={planningMode}
                   onModeChange={setPlanningMode}
                   testIdPrefix="add-feature-planning"
                   compact
                 />
-              ) : (
-                <TooltipProvider>
-                  <Tooltip>
-                    <TooltipTrigger asChild>
-                      <div>
-                        <PlanningModeSelect
-                          mode="skip"
-                          onModeChange={() => {}}
-                          testIdPrefix="add-feature-planning"
-                          compact
-                          disabled
-                        />
-                      </div>
-                    </TooltipTrigger>
-                    <TooltipContent>
-                      <p>Planning modes are only available for Claude Provider</p>
-                    </TooltipContent>
-                  </Tooltip>
-                </TooltipProvider>
-              )}

@Shironex Shironex removed the Work-In-Progress label Jan 24, 2026
@Monoquark
Contributor

Tested using GLM 4.7 Coder via OpenCode:

Manual testing: Create feature with spec/full plan mode using non-Claude model -- WORKS
Manual testing: Restart server during feature execution, verify resume works -- WORKS MOSTLY (see below)
Manual testing: Verify task progress updates in real-time on feature card -- WORKS
Manual testing: Verify correct summary appears in agent output -- MAYBE? ("The authentication context provider for task T018 has been fully implemented and is already in place:", cut off mid sentence apparently)

Notes:

  • Errors on one card when trying to resume a feature that was already in progress led to regular client crashes and restarts:
INFO  [AutoMode] Resuming feature: user-authentication (User Authentication System)
ERROR [Server] ERROR [AutoMode] Failed to resume feature user-authentication: Error: already running
    at AutoModeService.resumeFeature (/Users/kaykobusch/Development_local/automaker_testing/apps/server/src/services/auto-mode-service.ts:1797:13)
    at AutoModeService.resumeInterruptedFeatures (/Users/kaykobusch/Development_local/automaker_testing/apps/server/src/services/auto-mode-service.ts:5201:22)
    at async <anonymous> (/Users/kaykobusch/Development_local/automaker_testing/apps/server/src/routes/auto-mode/routes/resume-interrupted.ts:30:7)
INFO  [Server] POST /api/auto-mode/resume-interrupted 200

This card was resumed and finished.

Happened on another card with the same error. Stopping the offending card 'fixed' the crashes. Corrupted state after stopping it: the card kept showing the task list and did not have the Edit/Make buttons at the bottom, even though it was in the backlog after stopping. It was not movable via mouse, but could be restarted via the Auto mode switch. That led to the client losing the connection to the server, while the output window was still showing the server sending events.

…ode-system-inside-sbyt

Resolved conflict in apps/ui/src/hooks/use-query-invalidation.ts by:
- Keeping the refactored structure from v0.14.0rc (using constants and hasFeatureId() type guard)
- Adding the additional event types from the feature branch (auto_mode_task_status, auto_mode_summary) to SINGLE_FEATURE_INVALIDATION_EVENTS constant

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@apps/server/src/services/auto-mode-service.ts`:
- Around line 4644-4665: The emitted event name 'auto_mode_task_start' doesn't
match the typed/handled event 'auto_mode_task_started' and will be ignored;
update the emit call in the auto-mode flow to use the correct event name by
changing the string in emitAutoModeEvent(...) from 'auto_mode_task_start' to
'auto_mode_task_started', and also search for other occurrences of
'auto_mode_task_start' to either rename them or add the matching union variant
in AutoModeEvent so the type definitions and UI handlers align with the emitted
event.
- Around line 4212-4221: This block double-emits the 'auto_mode_summary' event
because saveFeatureSummary(projectPath, featureId, summary) already emits it;
remove the explicit this.emitAutoModeEvent('auto_mode_summary', { featureId,
projectPath, summary }) call from this recovery path (leave extractSummary(...)
and the await saveFeatureSummary(...) intact) so the summary is only emitted
once by saveFeatureSummary; alternatively, if you prefer central emission,
remove the emission from saveFeatureSummary and document that emit
responsibility moves to the caller, but do not keep both.
- Around line 4126-4137: The recovery task is calling provider.executeQuery with
bareModel which bypasses providerResolvedModel mapping; update the executeQuery
call in the recovery path to use effectiveBareModel (the variable already
computed) instead of bareModel so the resolved model ID is used consistently for
non‑Claude providers—locate the provider.executeQuery invocation that returns
taskStream and swap the bareModel argument for effectiveBareModel.
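
One way to catch an event-name mismatch like the `auto_mode_task_start` case above at compile time is to derive the emitter's signature from the event union. The sketch below uses a simplified, hypothetical `AutoModeEventSketch` union and an `emitTyped` helper; neither is the project's real `AutoModeEvent` type or `emitAutoModeEvent` method.

```typescript
// Hypothetical, simplified union; the real AutoModeEvent has more variants and fields.
type AutoModeEventSketch =
  | { type: 'auto_mode_task_started'; featureId: string; taskId: string }
  | { type: 'auto_mode_task_status'; featureId: string; taskId: string; status: string }
  | { type: 'auto_mode_summary'; featureId: string; summary: string };

type EventName = AutoModeEventSketch['type'];
// Payload for a given event name: the matching variant minus its `type` tag.
type PayloadOf<T extends EventName> = Omit<Extract<AutoModeEventSketch, { type: T }>, 'type'>;

function emitTyped<T extends EventName>(type: T, payload: PayloadOf<T>): AutoModeEventSketch {
  // A real emitter would forward this to the event bus; here it is just returned.
  return { type, ...payload } as unknown as AutoModeEventSketch;
}

const ok = emitTyped('auto_mode_summary', { featureId: 'f1', summary: 'Done' });
// emitTyped('auto_mode_task_start', { featureId: 'f1', taskId: 'T001' });
// The line above would not compile: 'auto_mode_task_start' is not in the union.
```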

In `@apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx`:
- Around line 592-602: The requirePlanApproval checkbox can remain true when
planningMode is 'skip' or 'lite'; update the component to normalize that by
clearing or forcing false for the requirePlanApproval state whenever
planningMode changes to 'skip' or 'lite' and by masking the checkbox checked
prop when planningMode is 'skip' or 'lite' (the checkbox with id
"add-feature-require-approval" / data-testid
"add-feature-require-approval-checkbox" and the requirePlanApproval state
variable). Implement this via a useEffect watching planningMode to
setRequirePlanApproval(false) when mode is 'skip'|'lite', and ensure the input's
checked uses planningMode === 'skip' || planningMode === 'lite' ? false :
requirePlanApproval so the payload built in the submit function (where
requirePlanApproval is read) never carries true while planning is disabled; keep
behavior unchanged for other modes and mirror logic used in PlanningModeSelect.
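The normalization described above can be expressed as a pure function that both the checkbox's `checked` prop and the submit payload could share. This is a sketch under assumptions: the helper names are hypothetical, and the `'spec'` and `'full'` mode names are inferred from the test report in this thread rather than taken from the component.

```typescript
// 'skip' and 'lite' come from the review comment; 'spec' and 'full' are assumed.
type PlanningMode = 'skip' | 'lite' | 'spec' | 'full';

// Hypothetical helper: approval only makes sense when a plan is generated.
function approvalAllowed(mode: PlanningMode): boolean {
  return mode !== 'skip' && mode !== 'lite';
}

// Hypothetical helper: the value to render and to submit, regardless of stale state.
function effectiveRequirePlanApproval(
  mode: PlanningMode,
  requirePlanApproval: boolean
): boolean {
  return approvalAllowed(mode) && requirePlanApproval;
}
```

Deriving the value at read time means the payload stays correct even before an effect has had a chance to clear the underlying state.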

@Monoquark
Contributor

Monoquark commented Jan 24, 2026

Update for my test report:

Tested using GLM 4.7 Coder via OpenCode:

Manual testing: Create feature with spec/full plan mode using non-Claude model -- WORKS RELIABLY

Manual testing: Restart server during feature execution, verify resume works -- WORKS; the earlier issues could not be reproduced and might have been caused by me skimping on RAM while using Electron

Manual testing: Verify task progress updates in real-time on feature card -- WORKS

>> Manual testing: Verify correct summary appears in agent output

  • 5/8 finished features received summaries, 3 did not
  • One summary again seems to be cut off after ":"; the full summary was: "Added route change event listener system to js/router.js:8:"
  • One summary was probably the model's fault: instead of summarizing the DB access layer implementation, it gave me a testing summary, after 17 tasks!!
Task T016 completed successfully:

Ran Playwright test suite with npm test
All 70 tests passed across 6 test files:
Chart Rendering tests: 23 tests
CRUD Operations tests: 14 tests
Database Singleton tests: 6 tests
Offline Functionality tests: 11 tests
Sample Data Generation tests: 15 tests
Schema Initialization tests: 5 tests
Test execution time: 6.8 seconds
No test failures or issues to fixI'll help you delete the temporary verification test file. Let me first s...

- Changed model references from `bareModel` to `effectiveBareModel` in multiple locations to ensure consistency.
- Removed redundant event emission for `auto_mode_summary` after saving feature summaries.
- Added checks to prevent resuming features that are already running, enhancing error handling.
- Introduced a new useEffect in various dialogs to clear `requirePlanApproval` when planning mode is set to 'skip' or 'lite'.
- Updated prompt templates to enforce a structured summary output format, ensuring critical information is captured after task completion.
…ialog

- Changed the data-testid from "add-feature-require-approval-checkbox" to "add-feature-planning-require-approval-checkbox" for better clarity and consistency in testing.
@Monoquark
Contributor

Another test run with 10 features, only two of them with incomplete / irrelevant summaries. For some reason those feature cards did not show the tasks while in progress but the task list was visible in the Logs view so I let them run. The summaries were again about the tests, or sth like "wow, I didnt need to do much, somebody else did all my work". Have not yet found a pattern, nothing in the CLI log.

@Shironex
Collaborator Author

> Another test run with 10 features, only two of them with incomplete / irrelevant summaries. For some reason those feature cards did not show the tasks while in progress but the task list was visible in the Logs view so I let them run. The summaries were again about the tests, or sth like "wow, I didnt need to do much, somebody else did all my work". Have not yet found a pattern, nothing in the CLI log.

did u pull the latest changes i added to include proper summaries blocks in prompts?

@Monoquark
Contributor

Monoquark commented Jan 25, 2026

> > Another test run with 10 features, only two of them with incomplete / irrelevant summaries. For some reason those feature cards did not show the tasks while in progress but the task list was visible in the Logs view so I let them run. The summaries were again about the tests, or sth like "wow, I didnt need to do much, somebody else did all my work". Have not yet found a pattern, nothing in the CLI log.
>
> did u pull the latest changes i added to include proper summaries blocks in prompts?

I did. Will keep an eye out for if it happens again.

On an unrelated note, I don't know why but suddenly my Auto Mode is limited to 1 agent. Not sure where that came from, I was using 5 in parallel even on this PR branch earlier.

- Added functionality to extract and save the final summary from multi-task or single-agent execution in the auto mode service.
- Updated event types in the query invalidation hook to include 'auto_mode_task_started' and 'auto_mode_task_complete' for better event handling.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/server/src/services/auto-mode-service.ts (1)

3978-3981: Avoid starting the main stream when the recovery path will return early.
provider.executeQuery(...) is created before the recovery short‑circuit, so it can fire an external call that is never consumed.

🛠️ Proposed fix
-    logger.info(`Starting stream for feature ${featureId}...`);
-    const stream = provider.executeQuery(executeOptions);
-    logger.info(`Stream created, starting to iterate...`);
     // Initialize with previous content if this is a follow-up, with a separator
     let responseText = previousContent
       ? `${previousContent}\n\n---\n\n## Follow-up Session\n\n`
       : '';
@@
     if (existingApprovedPlan && persistedTasks && persistedTasks.length > 0) {
       logger.info(
         `Recovery: Resuming task execution for feature ${featureId} with ${persistedTasks.length} tasks`
       );
@@
       await writeToFile();
       return;
     }
+
+    logger.info(`Starting stream for feature ${featureId}...`);
+    const stream = provider.executeQuery(executeOptions);
+    logger.info(`Stream created, starting to iterate...`);

Also applies to: 4070-4226
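The ordering issue can be shown in isolation. The sketch below assumes a simplified provider whose `executeQuery` returns an array of chunks instead of the real stream, so that it stays synchronous; `Provider`, `RunResult`, `runFeature`, and `countingProvider` are hypothetical names, not the service's actual API.

```typescript
// Hypothetical stand-in: the real provider returns a stream, modeled here as a
// plain array of chunks so the sketch stays synchronous.
interface Provider {
  executeQuery(prompt: string): string[];
}

interface RunResult {
  path: 'recovery' | 'normal';
  output: string;
}

function runFeature(
  provider: Provider,
  prompt: string,
  persistedTasks: string[]
): RunResult {
  // Recovery short-circuit comes first, so no provider call is made on this path.
  if (persistedTasks.length > 0) {
    return { path: 'recovery', output: `resumed ${persistedTasks.length} tasks` };
  }

  // Normal path only: the query is created after the early return.
  const stream = provider.executeQuery(prompt);
  return { path: 'normal', output: stream.join('') };
}

// Counts calls so an unused query would be visible.
let providerCalls = 0;
const countingProvider: Provider = {
  executeQuery(prompt: string): string[] {
    providerCalls += 1;
    return ['spec for ', prompt];
  },
};
```

Calling `runFeature` with a non-empty `persistedTasks` leaves `providerCalls` at zero, which is the property the review comment asks for.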

🤖 Fix all issues with AI agents
In `@apps/server/src/services/auto-mode-service.ts`:
- Around line 507-557: The resetStuckFeatures function resets stuck task
statuses but doesn't clear feature.currentTaskId, causing UIs to still point at
reverted tasks; update resetStuckFeatures (inside the loop over
feature.planSpec.tasks) so that when you set a task.status from 'in_progress' to
'pending' you also clear feature.currentTaskId if it equals that task.id (and
mark needsUpdate accordingly), ensuring the change is persisted when write-back
occurs.

…lete-fix-for-the-plan-mode-system-inside-sbyt
- Added logic to clear the currentTaskId for a feature if it points to a reverted task, improving task management and logging clarity.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/server/src/services/auto-mode-service.ts (1)

3985-4083: Avoid starting a stream before the recovery short‑circuit.

provider.executeQuery(...) is invoked before the recovery path returns, so recovery runs still create an unused stream (wasted API call + potential resource leak). Gate stream creation until after the recovery check.

🛠️ Proposed fix
-    // Execute via provider
-    logger.info(`Starting stream for feature ${featureId}...`);
-    const stream = provider.executeQuery(executeOptions);
-    logger.info(`Stream created, starting to iterate...`);
     // Initialize with previous content if this is a follow-up, with a separator
     let responseText = previousContent
       ? `${previousContent}\n\n---\n\n## Follow-up Session\n\n`
       : '';
@@
-    // RECOVERY PATH: If we have an approved plan with persisted tasks, skip spec generation
+    // RECOVERY PATH: If we have an approved plan with persisted tasks, skip spec generation
     // and directly execute the remaining tasks
     if (existingApprovedPlan && persistedTasks && persistedTasks.length > 0) {
       ...
       return;
     }
+
+    // Execute via provider (only after recovery short-circuit)
+    logger.info(`Starting stream for feature ${featureId}...`);
+    const stream = provider.executeQuery(executeOptions);
+    logger.info(`Stream created, starting to iterate...`);
🤖 Fix all issues with AI agents
In `@apps/server/src/services/auto-mode-service.ts`:
- Around line 546-561: The reset logic clears feature.currentTaskId but leaves
planSpec.currentTaskId stale; update the code in the reset loop (inside
resetStuckFeatures handling planSpec.tasks) to clear
feature.planSpec.currentTaskId when it equals the reverted task id (and set
needsUpdate = true and log the change), or clear both fields if both exist,
ensuring the logger message references planSpec.currentTaskId being cleared
rather than feature.currentTaskId so the UI/recovery won't still point at the
reverted task.
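A minimal sketch of the reset step this comment asks for, assuming simplified shapes: `Task`, `PlanSpec`, and `resetStuckTasks` are hypothetical and cover only the fields discussed here, and the `'completed'` status is an assumption (the PR text names only `in_progress` and `pending`).

```typescript
// Simplified, hypothetical shapes; 'completed' is an assumed status name.
type TaskStatus = 'pending' | 'in_progress' | 'completed';

interface Task {
  id: string;
  status: TaskStatus;
}

interface PlanSpec {
  tasks: Task[];
  currentTaskId?: string;
}

// Reverts stuck tasks and clears planSpec.currentTaskId when it points at one.
// Returns true when the plan spec was modified and needs to be written back.
function resetStuckTasks(planSpec: PlanSpec): boolean {
  let needsUpdate = false;
  for (const task of planSpec.tasks) {
    if (task.status !== 'in_progress') continue;
    task.status = 'pending';
    needsUpdate = true;
    if (planSpec.currentTaskId === task.id) {
      delete planSpec.currentTaskId;
    }
  }
  return needsUpdate;
}
```

The pointer is cleared on `planSpec`, not on the feature, which matches where the field lives according to the follow-up commit in this thread.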

Shironex and others added 2 commits January 25, 2026 13:41
…lete-fix-for-the-plan-mode-system-inside-sbyt

Resolved conflicts in dialog components by keeping simplified code
without modelSupportsPlanningMode conditional (always true now).
… resetStuckFeatures

Address CodeRabbit review comment: The reset logic was incorrectly
clearing feature.currentTaskId (which doesn't exist on Feature type)
instead of feature.planSpec.currentTaskId. This left planSpec.currentTaskId
stale, causing UI/recovery to still point at reverted tasks.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/server/src/services/auto-mode-service.ts (1)

3985-4233: Avoid starting the main agent stream when recovery will short‑circuit.
executeQuery is invoked before the recovery early return, which can trigger an unnecessary API call and stray output. Move stream creation after the recovery branch.

🐛 Proposed fix (delay stream creation until after recovery)
-    const stream = provider.executeQuery(executeOptions);
-    logger.info(`Stream created, starting to iterate...`);
-
     // RECOVERY PATH: If we have an approved plan with persisted tasks, skip spec generation
     // and directly execute the remaining tasks
     if (existingApprovedPlan && persistedTasks && persistedTasks.length > 0) {
       ...
       return;
     }
+
+    const stream = provider.executeQuery(executeOptions);
+    logger.info(`Stream created, starting to iterate...`);
🤖 Fix all issues with AI agents
In `@apps/server/src/services/auto-mode-service.ts`:
- Around line 228-265: detectSpecFallback currently only recognizes fenced
```tasks blocks or checklist lines via hasTasksBlock/hasTaskLines, which misses
planningLiteWithApproval outputs that use unstructured numbered tasks (e.g.,
"**Tasks**: 1. Create X, 2. Add Y") and so can stall when [SPEC_GENERATED] is
absent; update detectSpecFallback to also detect a tasks header and numbered
list patterns (for example add regexes for /\*\*Tasks\*\*:/i and
/(^|\s)\d+\.\s+[A-Za-z0-9]/ or a multiline numbered-list pattern) and include
those in hasTaskStructure (or add a new hasTasksHeader variable) so that
detectSpecFallback returns true for planningLiteWithApproval-style outputs while
keeping the existing spec-content checks; refer to the detectSpecFallback
function, hasTasksBlock, hasTaskLines, and the planningLiteWithApproval scenario
when making the change.
♻️ Duplicate comments (1)
apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx (1)

597-605: Consider masking checked when approval is disabled.

There’s still a brief window after switching to skip/lite where the checkbox can remain checked before the effect clears state; aligning with PlanningModeSelect’s masking avoids transient UI inconsistency and eliminates any chance of a stale payload.

♻️ Optional tweak
-                      checked={requirePlanApproval}
+                      checked={requirePlanApproval && planningMode !== 'skip' && planningMode !== 'lite'}

…lete-fix-for-the-plan-mode-system-inside-sbyt

Resolved conflict in auto-mode-service.ts by keeping the v0.14.0rc version
which uses isFeatureRunning() method and has more informative logging.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/server/src/services/auto-mode-service.ts (2)

3978-4397: Avoid starting the main provider stream when taking the recovery path.
At Line 4241 the recovery branch returns early, but the main provider.executeQuery(...) stream is already created above (Line 4150). This can trigger an unused model call and leak resources. Move stream creation below the recovery early‑return (or abort it) so only one execution path runs.

🛠️ Suggested fix
-    // Execute via provider
-    logger.info(`Starting stream for feature ${featureId}...`);
-    const stream = provider.executeQuery(executeOptions);
-    logger.info(`Stream created, starting to iterate...`);
...
     if (existingApprovedPlan && persistedTasks && persistedTasks.length > 0) {
       ...
       return;
     }
+
+    // Execute via provider (normal path only)
+    logger.info(`Starting stream for feature ${featureId}...`);
+    const stream = provider.executeQuery(executeOptions);
+    logger.info(`Stream created, starting to iterate...`);

4739-4987: Potential double‑emission of auto_mode_task_started.
You emit a start event at Line 4754 and then emit again when [TASK_START] appears (Line 4824), which can duplicate UI updates. Consider marking the start as already emitted before streaming (or skip the marker‑based emit when already started).

🛠️ Suggested fix
                     this.emitAutoModeEvent('auto_mode_task_started', {
                       featureId,
                       projectPath,
                       branchName,
                       taskId: task.id,
                       taskDescription: task.description,
                       taskIndex,
                       tasksTotal: parsedTasks.length,
                     });
+                    taskStartDetected = true;
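The guard suggested above amounts to emitting the start event through a single once-only path. A sketch under assumptions: `runTask` and `emitStartOnce` are hypothetical, the stream is modeled as an array of text chunks, and only the `[TASK_START]` marker and the `auto_mode_task_started` event name come from this thread.

```typescript
// Hypothetical task runner: emits the start event up front, then scans chunks.
function runTask(chunks: string[], emit: (eventName: string) => void): void {
  let taskStartDetected = false;

  // Single emission path shared by the upfront emit and the marker handler.
  const emitStartOnce = (): void => {
    if (taskStartDetected) return;
    taskStartDetected = true;
    emit('auto_mode_task_started');
  };

  // Emitted before streaming begins.
  emitStartOnce();

  for (const chunk of chunks) {
    // A [TASK_START] marker in the output must not trigger a second event.
    if (chunk.indexOf('[TASK_START]') !== -1) emitStartOnce();
  }
}
```

Routing both call sites through one function keeps the flag and the emit together, so neither site can forget to update it.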
♻️ Duplicate comments (1)
apps/server/src/services/auto-mode-service.ts (1)

228-266: Fallback spec detection misses lite-with-approval numbered tasks.
Line 239–255 only recognizes ```tasks blocks or “- [ ] T###” lines, so outputs like “Tasks: 1. …” won’t be detected when [SPEC_GENERATED] is absent, which can stall non‑Claude planning. Consider accepting numbered lists behind a Tasks header.

🛠️ Suggested fix
-  const hasTaskLines = /- \[ \] T\d{3}:/.test(text);
+  const hasTaskLines = /- \[ \] T\d{3}:/.test(text);
+  const hasTasksHeader = /\*\*Tasks\*\*:/i.test(text);
+  const hasNumberedTasks = /(?:^|\n)\s*\d+\.\s+\S+/m.test(text);
...
-  const hasTaskStructure = hasTasksBlock || hasTaskLines;
+  const hasTaskStructure =
+    hasTasksBlock || hasTaskLines || (hasTasksHeader && hasNumberedTasks);
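The patterns in this suggestion can be tried in isolation. The sketch below is a hedged excerpt, not the actual `detectSpecFallback`: `hasTaskStructure` is a hypothetical helper, the fenced tasks-block check is omitted, and the numbered-list pattern is a parameter so that the line-start variant from this suggestion can be compared with the looser `(^|\s)` variant from the earlier review comment.

```typescript
// Line-start variant, as in the suggested fix above.
const LINE_START_NUMBERED = /(?:^|\n)\s*\d+\.\s+\S+/m;
// Looser variant from the earlier review comment; also matches inline lists.
const INLINE_NUMBERED = /(^|\s)\d+\.\s+[A-Za-z0-9]/;

// Hypothetical excerpt of the task-structure check (fenced-block check omitted).
function hasTaskStructure(
  text: string,
  numberedPattern: RegExp = LINE_START_NUMBERED
): boolean {
  const hasTaskLines = /- \[ \] T\d{3}:/.test(text);
  const hasTasksHeader = /\*\*Tasks\*\*:/i.test(text);
  const hasNumberedTasks = numberedPattern.test(text);
  return hasTaskLines || (hasTasksHeader && hasNumberedTasks);
}

const multiLine = '**Tasks**:\n1. Create X\n2. Add Y';
const inline = '**Tasks**: 1. Create X, 2. Add Y';
```

One thing to check before adopting either pattern: the line-start variant matches `multiLine` but not `inline`, because in the inline form the first number does not start a line, while the looser variant matches both.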

@Shironex Shironex merged commit bf25a7a into v0.14.0rc Jan 25, 2026
7 checks passed
@Shironex Shironex deleted the feature/bug-complete-fix-for-the-plan-mode-system-inside-sbyt branch January 25, 2026 15:56

Labels

Bug Something isn't working Do Not Merge Use this label if something should not be merged. Testers-Requested Request for others to test an enhancement or bug fix/etc.
